Synthetic-perturbation Techniques for Screening Shared Memory Programs
Authors
Abstract
The synthetic-perturbation screening (SPS) methodology is based on an empirical approach; SPS introduces artificial perturbations into the MIMD program and captures the effects of such perturbations by using the modern branch of statistics called design of experiments. SPS can provide the basis of a powerful tool for screening MIMD programs for performance bottlenecks. This technique is portable across machines and architectures, and scales extremely well on massively parallel processors. The purpose of this paper is to explain the general approach and to extend it to address specific features that are the main source of poor performance on the shared memory programming model. These include performance degradation due to load imbalance and insufficient parallelism, and overhead introduced by synchronizations and by accessing shared data structures. We illustrate the practicality of SPS by demonstrating its use on two very different case studies: a large image understanding benchmark and a parallel quicksort.
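The core idea can be sketched in a few lines: inject artificial delays into suspect code regions, run the program under every on/off combination of delays (a full 2^k factorial design), and estimate each region's main effect on runtime. This is an illustrative sketch only; the region functions, the workload, and the `DELAY` size are stand-ins, not the paper's instrumentation.

```python
import itertools
import time

def region_a():
    time.sleep(0.01)   # stand-in for a suspect code region

def region_b():
    time.sleep(0.03)   # stand-in for another suspect region

REGIONS = {"A": region_a, "B": region_b}
DELAY = 0.02           # size of the synthetic perturbation

def run_program(perturb):
    """Run the workload once, delaying each region whose factor is high."""
    start = time.perf_counter()
    for name, fn in REGIONS.items():
        fn()
        if perturb[name]:
            time.sleep(DELAY)  # the synthetic perturbation
    return time.perf_counter() - start

# Full 2^k factorial design: run under every on/off combination.
names = list(REGIONS)
runs = {}
for levels in itertools.product([0, 1], repeat=len(names)):
    runs[levels] = run_program(dict(zip(names, levels)))

# Main effect of a factor: mean runtime with its perturbation on,
# minus mean runtime with it off (the standard DOE contrast).
effects = {}
for i, name in enumerate(names):
    high = [t for lv, t in runs.items() if lv[i] == 1]
    low = [t for lv, t in runs.items() if lv[i] == 0]
    effects[name] = sum(high) / len(high) - sum(low) / len(low)
    print(f"main effect of perturbing {name}: {effects[name]:.3f} s")
```

A region whose perturbation has a large main effect lies on the critical path or executes often; regions with negligible effects can be screened out before detailed tuning.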
Similar Resources
Development of Technique for Healing Data Races based on Software Transactional Memory
A data race occurs in a multi-threaded program when two or more threads access a shared location without proper synchronization and at least one of the accesses is a store. It is difficult to write data-race-free programs and to manually fix existing races, because races can produce results that are unpredictable to the programmer. This paper presents a technique that heals dat...
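The race pattern described above is the classic lost update. A minimal sketch (generic illustration, not the paper's STM-based healing technique) shows an unsynchronized read-modify-write and its lock-based fix:

```python
import threading

COUNT = 100_000
counter = 0
lock = threading.Lock()

def racy_increment():
    # Unsynchronized read-modify-write: two threads may both load the
    # same value, so one update can be lost -- the classic data race.
    global counter
    for _ in range(COUNT):
        counter += 1

def safe_increment():
    # The lock makes the read-modify-write atomic, healing the race.
    global counter
    for _ in range(COUNT):
        with lock:
            counter += 1

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print("racy:", run(racy_increment))   # may fall short of 200000 on some runs
print("safe:", run(safe_increment))   # always 200000
```

An STM-based healer would instead wrap the conflicting accesses in transactions and retry on conflict, which avoids choosing lock granularity by hand.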
System Software Support for Reducing Memory Latency on Distributed Shared Memory Multiprocessors
This paper overviews results from our recent work on building customized system software support for Distributed Shared Memory Multiprocessors. The mechanisms and policies outlined in this paper are connected with a single conceptual thread: they all attempt to reduce the memory latency of parallel programs by optimizing critical system services, while hiding the complex architectural details o...
Techniques and Tools for Distributed Shared Memory Performance Improvement
Distributed shared memory (DSM) systems hide the details of communication for parallel applications by providing a shared virtual memory space kept coherent via message passing. Although such systems are easy to program, performance is poor when memory contention is high. Performance can be improved by detecting and predicting future memory access patterns, such as migratory, producer-consumer, and grouped read...
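The access patterns named above can be classified from per-page access histories. A crude, hypothetical classifier (illustrative heuristics only; not taken from the cited work) might look like:

```python
def classify(history):
    """Classify a page's access history: list of (node_id, op), op in {"r","w"}."""
    writers = {n for n, op in history if op == "w"}
    readers = {n for n, op in history if op == "r"}
    # Producer-consumer: a single writer feeds one or more other readers.
    if len(writers) == 1 and readers - writers:
        return "producer-consumer"
    # Migratory: several nodes write in turn, and each node finishes all
    # of its accesses before the next node starts (no interleaving).
    order = [n for n, _ in history]
    runs = [n for i, n in enumerate(order) if i == 0 or order[i - 1] != n]
    if len(writers) > 1 and len(runs) == len(set(runs)):
        return "migratory"
    return "unclassified"

print(classify([(0, "w"), (1, "r"), (2, "r"), (1, "r")]))  # producer-consumer
print(classify([(0, "r"), (0, "w"), (1, "r"), (1, "w")]))  # migratory
```

A DSM runtime could use such a prediction to push updates to consumers or to migrate page ownership ahead of the next writer.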
[Title lost in extraction; only figure labels survive: thread processing units with first- and second-level instruction caches]
This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture [11, 12], allows concurrent execution of l...
Compilation Techniques for Fair Execution of Shared Memory Parallel Programs over a Network of Workstations
Compiler technologies are crucial for the efficient execution of sequential programs. This is not yet true for parallel programs, where the operating system performs most of the work, resulting in increased overhead for scheduling and distributed shared memory simulation. In this work we suggest simple compilation techniques that can be used to guarantee efficient execution of shared memory paralle...
Journal: Softw., Pract. Exper.
Volume: 24, Issue: –
Pages: –
Publication date: 1994